Design and Implementation of an Improved K-Means Clustering Algorithm


Abstract

To address the problems of the traditional K-means clustering algorithm, such as convergence to a local optimum and slow speed caused by the uncertainty of the k value and the randomness of initial cluster center selection, this paper proposes an improved K-means method. The algorithm first applies the elbow rule, based on the sum of squared errors (SSE), to obtain an appropriate number of clusters k. It then uses variance as a measure of the dispersion of the samples and selects, as initial cluster centers, the data points with the smallest variance whose distance is greater than the average distance of the sample set. Finally, the "triangular inequality principle" is used to reduce unnecessary distance calculations in the iterative process, which improves the operating efficiency of the algorithm. The results show that, when tested on UCI data sets and compared with the k-means and Canopy-KMeans algorithms, the accuracy and speedup ratio are significantly improved, and the quality...
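The elbow rule and the triangle-inequality step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`assign_with_pruning`, `kmeans_sse`), the toy `POINTS` data, and the choice to start from the first k points are all assumptions made for illustration, and the variance-based center selection is deliberately left out because the abstract does not give enough detail to reproduce it. The pruning rule used is the standard one: if d(c_a, c_b) >= 2 d(x, c_a), then d(x, c_b) >= d(x, c_a), so the distance from x to c_b need not be computed.

```python
import math

def assign_with_pruning(points, centers):
    """Assign each point to its nearest center, skipping distance
    computations that the triangle inequality proves unnecessary."""
    k = len(centers)
    # Center-to-center distances, computed once per assignment pass.
    cc = [[math.dist(a, b) for b in centers] for a in centers]
    labels, skipped = [], 0
    for x in points:
        best, best_d = 0, math.dist(x, centers[0])
        for j in range(1, k):
            # d(c_best, c_j) >= 2 * d(x, c_best) implies d(x, c_j) >= d(x, c_best),
            # so center j cannot be strictly closer than the current best.
            if cc[best][j] >= 2.0 * best_d:
                skipped += 1
                continue
            d = math.dist(x, centers[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, skipped

def kmeans_sse(points, k, iters=20):
    """Run plain Lloyd iterations and return the final sum of squared errors.

    Assumption: centers start at the first k points. This is a placeholder,
    not the variance-based selection described in the abstract.
    """
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        labels, _ = assign_with_pruning(points, centers)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    labels, _ = assign_with_pruning(points, centers)
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))

# Hypothetical toy data: three tight, well-separated groups of three points.
POINTS = [(0, 0), (10, 10), (-10, 10),
          (0, 1), (10, 11), (-10, 11),
          (1, 0), (11, 10), (-9, 10)]

if __name__ == "__main__":
    # Elbow rule: inspect how the SSE falls as k grows and look for the bend.
    for k in range(1, 6):
        print(k, round(kmeans_sse(POINTS, k), 3))
```

On data like this, the SSE should drop sharply up to k = 3 and only slightly afterwards, which is the "elbow" used to choose k. How the bend is detected automatically is a separate design choice that the abstract does not specify.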


Similar resources

An Improved K-Means with Artificial Bee Colony Algorithm for Clustering Crimes

Crime detection is one of the major issues in the field of criminology. In fact, criminology includes knowing the details of a crime and its intangible relations with the offender. In spite of the enormous amount of data on offenses and offenders, and the complex and intangible semantic relationships between this information, criminology has become one of the most important areas in the field o...


An Efficient k-Means Clustering Algorithm: Analysis and Implementation

In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k, and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. A popular heuristic for k-means clustering is Lloyd's algorithm. In this paper, we present a simple and efficient implementation of Llo...


Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm

Identifying clusters, or clustering, is an important aspect of data analysis. It is the task of grouping a set of objects in such a way that objects in the same group/cluster are more similar, in some sense or another, to each other. It is a main task of exploratory data mining and a common technique for statistical data analysis. This paper proposes an improved version of the K-Means algorithm, namely Persistent K...


An Improved K-means Algorithm for Clustering Categorical Data

Most of the earlier work on clustering has focused on numerical data, whose inherent geometric properties can be exploited to naturally define distance functions between data points. However, the computational cost makes most of the previous algorithms unacceptable for clustering very large databases. The k-means algorithm is well known for its efficiency in this respect. At the...


An Improved K-means Clustering Algorithm for Image Segmentation

Image segmentation is a primary step in many computer vision applications, whose purpose is to extract information from the images to allow the discrimination among different objects of interest. This task usually involves the partitioning of the image into a number of clusters, such that the data in each cluster share similar features. This work describes a new clustering algorithm for providi...


Journal

Journal title: Mobile Information Systems

Year: 2022

ISSN: 1875-905X, 1574-017X

DOI: https://doi.org/10.1155/2022/6041484